Abstract - In this paper a machine vision approach is
applied to IFSAR data to extract the most relevant built
structures in a dense urban environment. The algorithm
clusters primitives (line segments) into more complex
surfaces (planes) to approximate the 3D shape of these
objects. Results obtained from TOPSAR data recorded over
Santa Monica are presented.

1. INTRODUCTION 

Urban environments, with their complex structure composed
of buildings of different kinds and shapes, small and/or
large green areas, and infrastructure, have constantly been
a challenge for remote sensing analysts. Despite the large
number of works on the interpretation of urban images
acquired by different sensors, from classic cameras to
Synthetic Aperture Radars (SAR) [1], from multispectral to
hyperspectral sensors (like AVIRIS), a large amount of
information is still hidden in the raw data. In particular,
very few papers are devoted to the use of Interferometric
SAR (IFSAR) measurements for urban image analysis: one of
them is [2], where IFSAR and AVIRIS data are merged to
better distinguish buildings from green areas. Indeed, the
3D measurements obtained by IFSAR may be extremely useful
for extracting the complete topography of an urban
environment as well as for gaining more insight into
particular structures.

Analysis of IFSAR terrain elevation data in urban areas is
usually difficult because of insufficient spatial
resolution, multiple scattering due to the building
geometries, and layover effects, in addition to the
intrinsic IFSAR system-level noise. Therefore, there is
still a strong need to evaluate which type of information
is available from these data and to what extent it is
possible to extract it. The resolution problem is being
progressively addressed by the new generation of radar
sensors that will become operational in the near future
[3]: the goal of these systems is to provide a 1-meter
level spatial resolution, which can therefore resolve many
of the objects present in an urban environment. As for the
second problem, instead, we found it useful to apply some
suitable machine vision approaches to the original remote
sensing images. Indeed, even though they were developed for
very different situations, these procedures prove very
useful in this context.

The paper is organized as follows: Section 2 presents the
building extraction algorithm, Section 3 shows the
experimental results, and Section 4 discusses these results
and outlines future lines of research.

Figure 1: A range image of part of Wilshire Boulevard,
Santa Monica, derived from TOPSAR interferometric
measurements.

2. THE BUILDING EXTRACTION ALGORITHM

In this work we focus on extracting information on urban
structures from high resolution IFSAR data: specifically,
we want to automate the detection of the height and shape
of the buildings present in a given area. To this aim, we
apply to the original data a machine vision segmentation
algorithm able to exploit their resolution while remaining
highly robust to noise. In particular, the criteria applied
to segment the raw data are geometric, based on the
principle of plane fitting (i.e. finding the plane that
best approximates a given surface): in our situation this
corresponds to looking for the building roofs. The simplest
algorithm would be an iterative region growing approach:
we start from randomly chosen pixels and examine all the
adjacent ones; if a pixel is close to the sample in 3D
space, it is added. This idea is further improved by the
algorithm outlined in [4]: in this approach the primitives
of segmentation are the scan lines (lines of the image),
and they are aggregated into planes by considering suitable
geometric properties.

Figure 2: Classification results for the area in fig. 1 by
means of the building extraction algorithm outlined in the
text: note the regularization of the profile of the
foreground buildings.
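
The simple pixel-wise region growing mentioned in this
section can be sketched as follows. This is only an
illustrative sketch, not the algorithm of [4]: the function
name grow_region, the 4-connected neighbourhood, and the use
of a plain elevation difference with tolerance tol as the
"closeness in 3D space" test are all assumptions made here.

```python
from collections import deque


def grow_region(height, seed, tol):
    """Grow a region from `seed` over a 2D list of elevations.

    Assumption: a 4-connected neighbour joins the region when its
    elevation differs by at most `tol` from the pixel it was reached from.
    """
    rows, cols = len(height), len(height[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(height[nr][nc] - height[r][c]) <= tol:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region
```

On a small range image with a 10 m block next to flat ground,
starting from a pixel on the block returns only the block
pixels, since the step to the ground exceeds the tolerance.
Working pixel by pixel in this way is sensitive to noise,
which is one motivation for using longer primitives.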

The algorithm uses the scan lines (lines of the image) as
primitives of segmentation in order to save CPU time. It
then proceeds through three processing steps, which we
briefly recall here.


1. First, we group the pixels of each line into segments
according to the following geometric criterion: a line is
broken at the point furthest from the chord joining its
end points, provided that this distance is greater than a
given threshold. This procedure is repeated as long as a
breakpoint can be found. Each segment found is then
recorded in a doubly linked list, with pointers to its
neighbors (i.e. adjacent segments). The threshold is found
by considering the local noise variance, computed on a
3 x 3 window around each pixel p_ij as the mean square
difference between the data and the best-fitting plane.
2. The second step consists in finding the seeds for the
planar regions that we want to use to characterize the
original image. Each seed is formed by three adjacent
segments (longer than a given threshold), each belonging to
a different scan line. The optimal seed is chosen as the
one nearest to the ideal condition of three segments
belonging to the same planar surface. In this case the
segment length threshold is set as small as possible with
respect to the physical characteristics of the object and
the resolution of the image.

Moreover, the optimal seed is found by considering a
similarity measure based on the cosine of the angle between
each pair of segments (see [4], eq. (11)).

3. Next, an iterative region growing is performed. All the
segments adjacent to those of the seed are examined: if a
segment is close enough (with respect to both its slope and
intercept) to the plane that best approximates the seed, it
is added to the region. This process is iterated
(considering the new region) until no more expansion is
possible, and then repeated on other seeds until the image
is divided into planes.
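
The first step above can be sketched as follows. This is a
minimal illustration under stated assumptions, not the
implementation of [4]: the function names are hypothetical,
pixel spacing is taken as one unit along the scan line,
adjacent segments are assumed to share their breakpoint
pixel, and the mapping from the local noise variance to the
splitting threshold is not given in the text, so the
threshold is left as a plain parameter.

```python
import math


def local_noise_variance(window):
    """Mean square residual of a 3 x 3 window about its least-squares plane.

    `window` is a list of three rows of three elevations. With pixel
    offsets -1, 0, 1 on both axes the plane fit has a closed form.
    """
    offsets = (-1, 0, 1)
    cells = [(offsets[c], offsets[r], window[r][c])
             for r in range(3) for c in range(3)]
    mean = sum(z for _, _, z in cells) / 9.0
    slope_x = sum(x * z for x, _, z in cells) / 6.0  # sum of x^2 is 6
    slope_y = sum(y * z for _, y, z in cells) / 6.0  # sum of y^2 is 6
    return sum((z - (slope_x * x + slope_y * y + mean)) ** 2
               for x, y, z in cells) / 9.0


def chord_distance(i, z, i0, i1):
    """Distance of point (i, z[i]) from the chord joining the end points."""
    dx, dz = i1 - i0, z[i1] - z[i0]
    return abs(dz * (i - i0) - dx * (z[i] - z[i0])) / math.hypot(dx, dz)


def split_scan_line(z, threshold, i0=0, i1=None):
    """Split one scan line into segments, returned as (start, end) indices.

    The line is broken at the point furthest from the chord, as long as
    that distance exceeds `threshold` (assumed to be supplied by the caller).
    """
    if i1 is None:
        i1 = len(z) - 1
    if i1 - i0 < 2:  # no interior point left to test
        return [(i0, i1)]
    k = max(range(i0 + 1, i1), key=lambda i: chord_distance(i, z, i0, i1))
    if chord_distance(k, z, i0, i1) <= threshold:
        return [(i0, i1)]
    return (split_scan_line(z, threshold, i0, k)
            + split_scan_line(z, threshold, k, i1))
```

For example, the tent-shaped profile
[0, 1, 2, 3, 4, 3, 2, 1, 0] with a threshold of 0.5 is split
at its apex into the two segments (0, 4) and (4, 8), while a
flat profile is returned as a single segment.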

We must note that not all the pixels belong to a plane at
the end of the procedure: points affected by large noise,
or regions where no truly planar surface is observable, are
not aggregated.

Finally, the best-fitting plane for each region is
approximated with a horizontal one, as a first, rough model
of the building roofs.
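
The horizontal approximation can be illustrated with the
short sketch below. The text does not state how the height
of the horizontal plane is chosen; here we assume it is the
mean elevation of the region, which is the least-squares
horizontal plane. The function name flatten_regions, the
label image layout, and the use of label 0 for pixels not
aggregated into any plane are likewise assumptions.

```python
def flatten_regions(height, labels, unassigned=0):
    """Replace each labelled region by a horizontal roof.

    `height` and `labels` are 2D lists of the same shape. Pixels whose
    label equals `unassigned` keep their original elevation.
    Assumption: the roof height is the mean elevation of the region.
    """
    sums, counts = {}, {}
    for row_h, row_l in zip(height, labels):
        for h, lab in zip(row_h, row_l):
            if lab != unassigned:
                sums[lab] = sums.get(lab, 0.0) + h
                counts[lab] = counts.get(lab, 0) + 1
    roofs = {lab: sums[lab] / counts[lab] for lab in sums}
    flat = [[roofs.get(lab, h) for h, lab in zip(row_h, row_l)]
            for row_h, row_l in zip(height, labels)]
    return flat, roofs
```

The returned dictionary gives one height per region, which
is the quantity that can be compared with ground truth
building heights.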

3. EXPERIMENTAL RESULTS 

The interferometric SAR image used in this research covers
a portion of Santa Monica, in the metropolitan area of Los
Angeles (see fig. 1). It is an IFSAR range image, that is
to say an array of numbers representing the surface
elevation with respect to a reference plane; this image
therefore already gives us the three-dimensional profile of
the urban surface. The data were obtained with the TOPSAR
system, operated by NASA/JPL and mounted on a DC-8
aircraft. Ground truth was provided by a field survey of
the buildings and by measuring, as accurately as possible,
their footprints and heights.

Fig. 2 shows that the most relevant built structures have
been identified and extracted from these data. The
importance of this result is twofold. First, each of the
buildings is reconstructed and most of the noise and
shadowing/layover effects have been eliminated. Second, the
extracted structures are completely isolated from one
another, allowing some sort of rasterization of the
original image. Moreover, by grouping the range values into
consistent structures, we may study their distribution to
qualify the clustering results with respect to the
approximate shape that we compute for each building. In
fig. 3 a single building is shown together with a photo
taken from the ground, to provide a qualitative assessment
of the proposed approach.

From a quantitative point of view, Table 1 compares the
heights (in meters) of the tallest buildings in the studied
area with those measured. The accuracy is almost within the
precision of the TOPSAR sensor (±2.5 m), indicating that
the inaccuracies introduced by the grouping algorithm are
negligible. Worse results are obtained, as is easy to see
in fig. 2, for the areas of the buildings. We found that
they are heavily underestimated, and some anomalous cases
also occur. For instance, the layover effect on the
farthest building in fig. 1 has been aggregated into a very
large object that greatly overestimates the building hidden
in it. Furthermore, the two very small buildings indicated
are in fact a single one, erroneously split because of both
radar effects and problems in the detection algorithm.

Figure 3: An example of an extracted building: its 3D
profile compared with a photo.

Indeed, taking the image line segments as primitives
introduces a privileged direction in the segmentation
procedure. This choice has no influence on the results when
the surfaces to be retrieved are large with respect to the
starting segment primitives, as in the original application
[4]. In our IFSAR image, instead, where a building may
consist of three or four segments a few pixels long, this
direction must be carefully chosen. Therefore, if the
environment to be analyzed (as in almost any urban
situation) contains objects with different orientations,
the detection and reconstruction accuracy may be a
decreasing function of the angle between the segmentation
direction and the building direction.


Table 1: Actual and measured heights, in meters, of the
buildings extracted (mean error = 2.2 m, σ = 4.9 m).

building               height   IFSAR height   error
Coastal Federal Bank       81             86      -5
World Savings             110             99     -11
11755 Wilshire             98             99      +1
Barrington Plaza           74             71      -3
11645 Wilshire             45             49      +4


4. CONCLUSIONS 

This work shows that it is possible to extract buildings
from TOPSAR data by means of a suitably adapted machine
vision approach. The algorithm was applied to the detection
of the major structures in a part of Wilshire Boulevard,
Santa Monica, with accurate results for the retrieved
heights and some underestimation of the footprints.

Future developments could include pre-processing algorithms
to eliminate as much as possible the multiple scattering
effects on the radar backscattered signal before applying
the clustering step. Indeed, many of these erroneous
measurements may be discarded by the detection procedure,
but affected areas that are too large may also produce
false targets, even of large dimensions. Moreover, a more
refined version of the same algorithm, based on direct
plane fitting, is under development.